A Dynamic and Compact Fault-Tolerant Strategy for Fat-tree

Authors

  • C. Gómez
  • P. López
  • J. Duato
Abstract

High-performance interconnection networks are used to maximize performance in clusters of PCs. The fat-tree topology has risen in popularity in the last few years. Routing and fault tolerance are two important design issues in these networks. This paper presents a dynamic fault-tolerant routing strategy for fat-trees that requires no additional hardware and supports a reasonable number of faults. After a fault, an algorithm dynamically spreads the updated routing information, with a small impact on global system throughput. The proposal enhances the Interval Routing scheme with exclusion intervals that indicate the nodes that become unreachable after a fault appears.
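To make the idea of exclusion intervals concrete, here is a minimal Python sketch. It is not the authors' implementation: the `Port` structure, the port names, the 16-node numbering, and the faulty range are all assumptions chosen for illustration. Each output port carries an interval of destinations it normally reaches, plus a list of exclusion intervals naming destinations that have become unreachable through that port after a fault.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Interval = Tuple[int, int]  # inclusive [low, high] range of destination node IDs


@dataclass
class Port:
    """One output port of a switch (hypothetical structure, for illustration only)."""
    name: str
    reach: Interval  # destinations normally reachable through this port
    excluded: List[Interval] = field(default_factory=list)  # lost after faults

    def allows(self, dest: int) -> bool:
        """True if dest lies in the routing interval and in no exclusion interval."""
        low, high = self.reach
        if not low <= dest <= high:
            return False
        return not any(a <= dest <= b for a, b in self.excluded)


def eligible_ports(ports: List[Port], dest: int) -> List[str]:
    """Names of the ports that may still be used to route towards dest."""
    return [p.name for p in ports if p.allows(dest)]


# Assumed scenario: two up-ports of a switch in a 16-node network,
# each initially able to reach every destination.
ports = [Port("up0", (0, 15)), Port("up1", (0, 15))]
before = eligible_ports(ports, 5)

# Assumed fault: nodes 4..7 become unreachable through up0,
# so an exclusion interval is recorded on that port.
ports[0].excluded.append((4, 7))
after = eligible_ports(ports, 5)
unaffected = eligible_ports(ports, 9)
```

In this sketch, routing towards node 5 falls back to the remaining port once the exclusion interval is recorded, while destinations outside the excluded range keep both choices. How exclusion intervals are actually computed and propagated between switches is described in the paper itself and is not modeled here.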

Similar resources

A New Design of Fault Tolerant Comparator

In this paper we present a new design of a fault-tolerant comparator with a fault-free hot spare. The aim of this design is to achieve low time and area overhead in fault-tolerant comparators. We use the hot standby technique to keep the system operating normally without interruption, and a dynamic recovery method for fault detection and correction. The circuit is divided into smaller modules...

Research on Safety Risk of Dangerous Chemicals Road Transportation Based on Dynamic Fault Tree and Bayesian Network Hybrid Method (TECHNICAL NOTE)

A safety risk study of the road transportation of hazardous chemicals provides a reliable basis for the government to formulate transportation plans and prepare emergency schemes, and is also an important reference for safety risk managers carrying out dangerous chemicals safety risk management. Based on the analysis of the transport safety risk of dangerous chemicals at home and abroad, this paper studi...

Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid

Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing that periodically ...

Bridging the gap between Fault Tree Analysis Modeling Tools and the Systems being Modeled

Fault-tolerant systems consist of subsystems that interact with each other in complex ways [Joh89]. As a result, modeling the reliability of these systems calls for sophisticated analytical techniques. A powerful technique to address this issue is dynamic fault tree analysis [Dug92]. But because the semantics on which Dynamic Fault Trees are based are themselves complex, there was a question o...

Publication date: 2006